F0 transformation within the voice conversion framework

نویسندگان

  • Zdenek Hanzlícek
  • Jindrich Matousek
چکیده

In this paper, several experiments on F0 transformation within the voice conversion framework are presented. The conversion system is based on a probabilistic transformation of line spectral frequencies and residual prediction. Three probabilistic methods of instantaneous F0 transformation are described and compared. Moreover, a new modification of inter-speaker residual prediction is proposed which utilizes the information on target F0 directly during the determination of suitable residuum. Preference listening tests confirmed that this modification outperformed the standard version of residual prediction.

برای دانلود متن کامل این مقاله و بیش از 32 میلیون مقاله دیگر ابتدا ثبت نام کنید

ثبت نام

اگر عضو سایت هستید لطفا وارد حساب کاربری خود شوید

منابع مشابه

Voice Conversion Based on Probabilistic Parameter Transformation and Extended Inter-speaker Residual Prediction

Voice conversion is a process which modifies speech produced by one speaker so that it sounds as if it is uttered by another speaker. In this paper a new voice conversion system is presented. The system requires parallel training data. By using linear prediction analysis, speech is described with line spectral frequencies and the corresponding residua. LSFs are converted together with instantan...

متن کامل

Text-independent F0 transformation with non-parallel data for voice conversion

In voice conversion, a simple frame-level mean and variance normalization is typically used for fundamental frequency (F0) transformation, which is text-independent and requires no parallel training data. Some advanced methods transform pitch contours instead, but require either parallel training data or syllabic annotations. We propose a method which retains the simplicity and text-independenc...

متن کامل

Data-driven emotion conversion in spoken English

This paper describes an emotion conversion system that combines independent parameter transformation techniques to endow a neutral utterance with a desired target emotion. A set of prosody conversion methods have been developed which utilise a small amount of expressive training data ( 15 min) and which have been evaluated for three target emotions: anger, surprise and sadness. The system perfo...

متن کامل

Hierarchical modeling of F0 contours for voice conversion

Voice conversion systems deal with the conversion of a speech signal to sound as if it was uttered by another speaker. The conversion of the spectral features has attracted a lot of research attention but the conversion of pitch, modeling the speakerdependent prosody, is often achieved by just controlling the F0 level and range. However, the detailed prosody, including different linguistic unit...

متن کامل

Emotional Voice Conversion Using Neural Networks with Different Temporal Scales of F0 based on Wavelet Transform

An artificial neural network is one of the most important models for training features of voice conversion (VC) tasks. Typically, neural networks (NNs) are very effective in processing nonlinear features, such as mel cepstral coefficients (MCC) which represent the spectrum features. However, a simple representation for fundamental frequency (F0) is not enough for neural networks to deal with an...

متن کامل

ذخیره در منابع من


  با ذخیره ی این منبع در منابع من، دسترسی به آن را برای استفاده های بعدی آسان تر کنید

برای دانلود متن کامل این مقاله و بیش از 32 میلیون مقاله دیگر ابتدا ثبت نام کنید

ثبت نام

اگر عضو سایت هستید لطفا وارد حساب کاربری خود شوید

عنوان ژورنال:

دوره   شماره 

صفحات  -

تاریخ انتشار 2007